# Generative Image Captioning

Git Large Msrvtt Qa

GIT (GenerativeImage2Text) is a Transformer decoder conditioned on both CLIP image tokens and text tokens. This large-sized checkpoint is fine-tuned for MSRVTT-QA, a video question-answering task.
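To make "conditioned on both image tokens and text tokens" concrete, here is a minimal sketch of the attention pattern such a decoder could use. It assumes the scheme described in the GIT paper, where image tokens attend to each other freely and each text token attends to all image tokens plus the text tokens before it. The function name `git_attention_mask` and its parameters are hypothetical and are not part of any library API.

```python
from typing import List


def git_attention_mask(num_image_tokens: int, num_text_tokens: int) -> List[List[int]]:
    """Build a mask where mask[i][j] == 1 means position i may attend to position j.

    Positions 0..num_image_tokens-1 are image tokens; the rest are text tokens.
    Assumed pattern (illustrative only, not taken from the model's source code):
      - image tokens see all image tokens, but no text tokens
      - text tokens see all image tokens and text tokens up to and including themselves
    """
    total = num_image_tokens + num_text_tokens
    mask = [[0] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            if j < num_image_tokens:
                # Every position, image or text, may attend to every image token.
                mask[i][j] = 1
            elif i >= num_image_tokens and j <= i:
                # Text positions attend causally to earlier text and to themselves.
                mask[i][j] = 1
    return mask


if __name__ == "__main__":
    # Two image tokens followed by three text tokens.
    for row in git_attention_mask(2, 3):
        print(row)
```

In a question-answering setting, the question would typically occupy the first text positions as a prefix and the answer would be generated after it, so every answer token can see both the visual tokens and the full question.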

Tags: Transformers, Supports Multiple Languages
